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Abstract 

In Monte Carlo simulations, the thermal equilibria quantities are estimated 
by ensemble average over a sample set containing a large number of corre- 
lated samples. As the stochastic error of the simulation results is significant, 
it is desirable to understand the variance of the estimation by ensemble av- 
erage, which depends on the sample size (i.e., the total number of samples in 
the set) and the sampling interval (i.e., cycle number between two consecu- 
tive samples). Although large sample sizes reduce the stochastic error, they 
increase the computational cost of the simulation. In this work, we report a 
few general rules that relate the variance with the sample size and the sam- 
pling interval. These relations were observed in our numerical results. The 
main contribution of this work is the theoretical proof of these numerical 
observations and the set of assumptions that lead to them. 

Keywords: Monte Carlo simulation, phase coexistence, Gibbs ensemble, 
molecular simulation, multi-component system 



1. Introduction 

An essential part of many scientific problems is to evaluate an integral 
in a high-dimensional space with the integrand containing a weighting func- 
tion f{x) (probability distribution function of the configuration x) which is 
large in some area but close to zero almost everywhere else. The compu- 
tational cost of evaluating the integral by conventional quadrature schemes 
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is prohibitive since it demands a huge number of quadrature points inside 
a high- dimensional space. This integral can be estimated by the average 
value of the integrand over a large number of configurations sampled inside 
the domain randomly, independently and uniformly using Monte Carlo (MC) 
method. Metropolis and Ulam pQ (see |2]) dubbed this simulation method 
Monte Carlo since it uses a huge number of random numbers generated by a 
computer. The accuracy of the MC method can be improved by using the im- 
portance sampling scheme [3] which generates configurations non-uniformly 
but according to an artificially selected probability density function g{x) 
close to the integrand so that more probability mass is assigned to those 
configurations with higher probability [2H1]. In order to ensure the sam- 
pled configurations are still independent, it demands the primitive function 
G{x) of g{x) and its inverse function x{G). Unfortunately, it is not feasible 
to find such g{x) in most applications of interest. Rather than generat- 
ing independent configurations, the Metropolis method |5], which still uses 
the importance sampling idea, generates (possibly) correlated configurations 
from the original f{x) by Markov chains. Markov chains make the algorithm 
simple and universal. This method is known as Markov chains Monte Carlo 
(MCMC) method (see [Ij). Since the samples are correlated with each other, 
the variance of MCMC simulation with the same sample size is larger than 
the variance of the MC methods using independent configurations. In the 
rest of this article, we discuss only the MCMC method and refer it as simply 
the Monte Carlo (MC) method. 

The use of averages is common in scientific studies and many quantities 
related to thermal equilibria are averaged properties measured in real exper- 
iments over large numbers of particles and long time intervals. If the ergodic 
hypothesis applies to the system P], we can compute those quantities by 
ensemble averaging instead of time averaging using a partition function, an 
idea stemming from statistical mechanics. Monte Carlo method is a powerful 
tool based on ensemble averaging idea and so it can be used to calculate the 
quantities related to the thermal equilibrium state. 

A system with fixed particle number N, volume V, and temperature T 
can be described by canonical (constant-iVl^T) ensemble with the partition 
function containing only the coordinates of the N particles as independent 
variables. This description is valid for systems where the quantities of interest 
depend explicitly only on the location of all the particles. MC simulations of 
this system apply a random sequence of displacements to randomly selected 
particles. This random selection of particles and displacement is known as a 
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ttrial move. The sample sequence it forms generates a (correlated) Markov 
chain. The correlation of this sequence depends on the maximal random 
displacement applied, that is, the step size that determines the acceptance 
rate of the trial move. 

Most real experiments are carried out under conditions of controlled pres- 
sure and temperature. Thus the isobaric-isothermal ensemble (constant- 
NPT) is widely used in MC simulations where the particle location and the 
system's volume are randomly modified to visit all possible configurations 
according to their respective probabilities. Here, the step size of the volume- 
changing trial move also influences the correlation degree of the successive 
configurations. 

In adsorption studies where the chemical potential /i is fixed, instead 
of the particle number N, the grand-canonical (constant-/i^T ) ensemble 
is used to calculate the average particle number. The corresponding MC 
method includes a displacement trial move and trial insertion and removal of 
particles with step size usually fixed to one particle, that is, only one particle 
is tentatively inserted or removed from the volume each time. The accep- 
tance ratio of particle insertion and removal is very small thus it results in 
high correlation degree of the related successive configurations. This corre- 
lation degree cannot be reduced because the step size is already the minimal 
divisible unit, a particle. 

For the simulation of coexisting phases, important in many engineering 
applications, the MC algorithms based on the traditional ensembles described 
above suffer some important drawbacks. For example, limited computational 
resources imply that the number of particles used to represent the mixture 
is relatively small. Thus, a large fraction of all particles used reside in the 
vicinity of the mixture interface. This induces a bias towards the interfacial 
properties when ensemble averages are computed, rather than including a 
balanced representation of the bulk phases. 

In the literature several improvements to the traditional sampling have 
been proposed. In [B], a Gibbs-A^V^T MC method, where the total particle 
number, total volume, and temperature are fixed, was proposed to allevi- 
ate these algorithmic restrictions. This Gibbs-iVl^T scheme combines NVT, 
NPT and fiVT ensembles for simulating coexisting phases. This combination 
skillfully avoids the interface predominance by introducing two subsystems 
modeled as separate boxes. This model allows particles to swap from one 
phase (box) to the other while the potential energy between particles from 
different phases is neglected. Additionally, volume exchange are allowed be- 
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tween the two boxes while the total volume is conserved. The acceptance 
ratio of particle swap is very small, as was the case for the grand-canonical 
ensemble simulation. This limitation can be particularly severe when the 
density of one of the phases is very high. This limitation is important when 
modeling deposition and separation of dense liquids and solids. This draw- 
back is avoided in the Gibbs-Duhem integration method [S-E]. Nevertheless, 
the Gibbs-Duhem integration needs the initial point on the coexistence curve, 
thus it relies on the use of another method that can provide this initial point. 
If one of the coexisting phases is a crystal, the method proposed in \lQl im- 
proves the acceptance probability of exchanging particles. 

There are a lot of successful applications of the MC method based on 
Gibbs ensemble for water systems [H] and the oil production and process- 
ing [T2] - [T8] . In these applications the solubility of hydrogen sulfide and other 
corrosive components in the gas-hydrocarbon mixtures is important data. 
Nevertheless, this solubility is poorly understood due to lack of experimen- 
tal results. In Gibbs-A^\^T ensemble simulations of two coexisting phases, 
there are three kinds of trial moves: particle displacement, volume exchange, 
and particle swap. In order to reduce variance of the simulation results by 
decreasing the correlation degree of configurations, we adjust the step size 
for the first two trial moves. A discussion of the relationship between the 
variance and the step size of particle displacement is given in [2] but it is usu- 
ally difficult to obtain a general rule for such a relationship. Recently jT^ . 
the liquid-vapor coexistence of methane is first simulated by the Gibbs- A^\^T 
MC method and then the variation of mole fraction with pressure in a two- 
component system at phases coexistence state is studied by the Gibbs- iVPT 
MC method proposed in [20], where the total particle number, pressure and 
temperature are fixed. 

When Markov chains evolution is used for Monte Carlo simulation, it is 
not advisable to sample the system for the quantities of interest after each 
trial move. This requires too much memory while the correlation in the 
data is high; instead, the system is sampled at intervals (sampling interval). 
The larger the sampling interval is, the smaller the correlation degree will 
be. The same applies to the variance with fixed sample size (i.e., the total 
number of sampled cycles). The computational cost (i.e., simulation time) is 
almost proportional to the product of the number of samples collected and 
the sampling interval. Thus, increasing the sampling interval implies to either 
increase the simulation time when keeping the number of samples constant 
or to increase the variance of the results, when keeping the simulation time 
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constant. Nevertheless, our simulation results show that we can achieve a 
good trade-off between total simulation time and memory usage. In this 
paper, we describe the Gibbs-A^\^T MC method and employ it to model 
the coexisting phases of a Lennard- Jones (LJ) fluid. To make the problem 
tractable for theoretical analysis, we analyze the influence of the sampling 
interval and sample size on the variance of the simulation results for an 
idealized fluid, rather than the LJ fluid system. Finally, a general theoretical 
analysis is proposed to justify and prove some of the empirical observations 
and rules proposed. 



2. The Monte Carlo Method 

Suppose we wish to evaluate the following average: 

fn f(x)A(x)dx 

To compute (A), it is convenient to use the Monte Carlo method, based on 
Markov chains, to generate correlated conflgurations Xi after each cycle with 
a probability density proportional to f{x). Note that only f{x) and not its 
integral, f{x)dx, is used in the MC method. The conflguration xj at 
each sampled cycle is used to estimate the expected value {A) by the average 
value A = ^ Si=i ^(^j) ^ large sample set. 

2.1. Basic algorithm of MC method 

The algorithm [5J of Monte Carlo method using Markov chains for solving 
the general integral ([T]) can be summarized as follows: 

1. Initialization of conflguration; 

2. For each cycle: 

(a) Apply trial move algorithm; 

(b) Apply acceptance criterion; 

3. Sample the system at regular intervals; 

4. Stop after getting sufficient samples for analysis. 

The initial conflguration can be selected randomly from within the domain fl^ 
of the deflnition of the conflguration space. The Markov chain is generated 
by randomly modifying the current conflguration x into x' by applying the 
trial move algorithm. 
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The algorithm outhned in steps [T] to |4] should satisfy the ergodicity and 
time-reversal conditions. The ergodicity condition requires that from the 
current configuration x it is possible to visit any x' G fi^ by a limited number 
of trial moves. The time-reversal condition requires that the probability for 
the system to change back to its previous state is larger than zero. The 
probability density of the trial move event {x — )■ x') is denoted by a;(a; — >■ 
x'). The algorithm can be simplified significantly by using the symmetric 
condition that requires that the probability of the trial move from x to x' is 
equal to the probabihty of the reverse move, that is, a{x — )■ x') = a{x' — )■ 



x). Any configuration generated in step 2a will be accepted or rejected in 
step |2b] based on the following acceptance criterion: the new configuration 
x' is accepted if Rf (random number distributed uniformly inside [0, 1]) is 
less than 

a{x' — x)f{x') 
a{x ^ x')f{x) 

or rejected otherwise. This means that the acceptance probability equals to 

, . aix' — )■ x)f(x') 
acdx — 7- x ) = mtn 1, — ; , , 

This selection for the acceptance probability is based on the detailed balance 
condition for equilibrium state that can be stated as 

f{x)a{x — 7- x')acc{x — )■ x') = f{x')a{x' — > x)acc{x' — )■ x) 

and also on the fact that 

min [1,13] ^ ^ 
min [1, 

We note that the detailed balance condition is a sufficient but not a necessary 
requirement and that in [21] it was shown that the weaker balance condition 
is a sufficient requirement. 

Samples are collected in step 3 after the simulation has reached the statis- 
tically steady state, that is, after an initial transitional period. The quantities 
of interest are estimated from samples collected every d cycles. 

2.2. Algorithm based on Gibbs-NVT ensemble 

In the Gibbs-A^yT ensemble [B], as described in [5], the probability den- 
sity function and the related partition function are expressed as: 

f(N V ^^"^M - yf'iy-yir-'''^M-HUi+u2)] 
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(c) particle swap between boxes 
Figure 1: Schematic model for trial moves of Gibbs- iVl^T ensemble. 
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and 



N 

Qg{N,V,T)= 5^ / 

Afi=0 




^ 1 J ^ 2 



dVi (3) 



where T is the temperature of both boxes, A^^i is the particle number inside 
box 1, Vi is the volume occupied by box 1, and are the positions 

1/3 1/3 

of the A^i and N2 = N — Ni particles normalized by the size V;'-" and 
of box 1 and 2, respectively, A = h/ A/27rm//3 is the thermal de Broglie 
wavelength with /3 = 1/(/cbT), h is the Planck constant, ks is the Boltzmann 
constant, Ui = Ui{~^i^ ,Vi) is the total potential energy of box 1, namely 
a summation of pair potential energy Uij contributed by particles i and j 
inside box 1. The expressions for the probability density function ^ and the 
related partition function (|3| are obtained after completing the integration 
with respect to the momentum variables. For LJ fluid, we have: 



Ui 



ULj{r) = 4e 
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a 



(4) 



where r = — and l^i are the coordinates of particle i. To simplify 



our computations, we can substitute Eq. Q by a truncated potential 



ULj{r), r < re] 



0, 



r > rr. 



(5) 



with correction by 



U 



tail 
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- -3 - 



for the total energy in the corresponding box due to contributions beyond the 
cutoff distance Tc- In MC simulations, it is convenient to use non-dimensional 
quantities. The resulting non-dimensional system is defined by the follow- 
ing normalized quantities: number density p* = pa^, pressure p* = pa^e, 
temperature T* = Tks/e, and energy u* = u/e. 

As the probability distribution function of Eq. ^ contains three inde- 
pendent variables, three kinds of trial moves are used in the related MC 
algorithm: particle displacement, volume exchange, and particle swap (see 
Fig. [1]). The acceptance probability for the three kinds of trial moves can 
be obtained according to the detailed balance condition mentioned above in 
section 12. 1[ 
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cycle 



Figure 2: Evolutions of normalized densities, T* = 0.9. 

2.3. MC simulation using Gibbs-NVT ensemble 

We simulate phase coexistence of an LJ fluid by the Monte Carlo method 
based on Gibbs-A^^T ensemble. The cutoff distance for the two boxes is 
fixed at 45% (smaller than a half) of the relative box size. This box sizes are 
modified in each volume exchange trial move. One thousand particles are 
used in our simulation and the initial normalized density of the two boxes is 
Pinit = 0-3 unless otherwise stated. 

In each cycle a trial move is applied. This trial move is selected randomly 
from three possible kinds. The probability for selecting the displacement trial 
move is 0.9, but it is 0.01 for volume exchange and 0.09 for particle swap. 
After a transitional period (about Lmu = 2 x 10^ cycles for the current 
simulations), we sample the system every 50 cycles, that is, d=50. 

The initial values of As and AV are chosen to be 0.1 (see Fig. [s]). In order 
to make the acceptance ratios of the related trial moves approach to user- 
defined values, the step sizes are modified according to an auto-adjusting 
algorithm (see the source code mentioned in the preface of [2]) using the 
collected information. These step sizes are reset at the end of each Ladjust = 
5 X 10^ cycles. The self-adjusting procedure utilized ensures that by the 
completion of the initial Linu cycles, the step sizes of the different trial moves 
are such that the acceptance ratios of those trial moves are approximately 
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Figure 3: Evolutions of normalized volumes, T* = 0. 
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Figure 5: Evolutions of acceptance ratios and step sizes, T* = 0.9. 

the predetermined value of 0.5. Once the first Linu steps are executed, the 
step sizes are kept fixed for the remaining of the computation. These fixed 
step sizes ensure that the following trial moves are symmetric. The reported 
average values are computed using 2^^ correlated samples. 

For T*=0.9, it shows in Figs. |2]j4] that the normalized density, volume, 
and pressure of the two boxes reach convergence after the predetermined 
Linit = 2 X 10^ cycles. Notice that before Linu time steps are complete, the 
step sizes As, AV are adjusted and the related achieved acceptance ratios 
are changed correspondingly and finally approach to the predetermined value 
of 0.5 as shown in Fig. [5j After Linu cycles, the step sizes are fixed to their 
latest values and the related acceptance ratios fluctuate around 0.5 as desired. 
Fig. |5] also shows that the acceptance ratio of particle swap between boxes 
is only about 0.0016 because the density of box 1 is very high (see Fig. [2]) 
and this situation will get worse as the density increases. As discussed in the 
introduction, this acceptance ratio cannot be improved although it results 
into high correlation degree of the successive configurations. The results of 
p* for different T* are shown in Fig. [6] including comparison with results 
computed using the state of equation and MC simulations [2]. 

As shown in Figs. [2]j4], it seems that the statistical noise of the simulation 
results in the dense-phase box is larger than in the other box. A similar 
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Figure 6: Phase diagram of Lennard- Jones fluid. 
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Figure 7: Evolutions of normalized densities, T* = 1.25. 
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observation is made in [2]. For example, the simulation results of T* = 1.25 
with initial density still 0.3 are shown in Figs. [^j9] where we observe that the 
intensity difference of statistical noise of the two phases is reduced with the 
decrease of the density difference. 

3. Blocking Method for Estimating the Variance 

In MC simulations, each sample X{ IS cL measurement of a random variable 
X with exact but unknown probability distribution, from which we define the 
expected value (x). If the measurements can be taken as independent, the 
variance a^{x) of the average quantity x = ^ SILi -^^ inversely proportional 
to the size n of the sample set. But, if they are correlated with each other, 
the variance also depends on the sampling interval d between two successive 
samples. 

In the blocking method [23], the following transformation is employed to 
decrease the sample size till n' = 2 

X\ = {X2i-l + X2i)/2 

n' = n/2 

After each blocking step, we get a new value for 

1 1 _ 

^ ' 1=1 

which will increase during the blocking process and approximates cr^(x) if 
convergence is achieved. The value at the convergence point is used to es- 
timate the variance of the average value. If the blocking process does not 
converge, the largest value during the blocking process is a lower bound of 
the variance [23] • Convergence will happen if the sample set covers a time 
span which is several times larger than the maximal correlation interval r of 
the sample set so that the 'blocking' variables x[ at the convergence point 
are independent Gaussian variables. The subtlety of the blocking method is 
to decrease the correlation degree of the new sample set {x'j)i=i^... making 
the correlated functions 7^'^ = (x[x'j) — (x^) (x^) ,i ^ j tend to zero. 

4. Influence of Simulation Parameters on the Variance 

We take the set of samples after each trial move as the full sample set. If 
the trial move is accepted, the next sample is different from the previous one 
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Figure 10: Variance estimations by the blocking method, T* = 0.9, n = 2^^, 
PI and P2 are the pressure variance in boxes 1 and 2, respectively, D\ and 
D2 are the density variance in boxes 1 and 2, respectively, and VX and V2 
are the volume variance in boxes 1 and 2, respectively. 



but if rejected, the configuration remains unchanged and the next sample is 
the same as the previous one. The repeated samples induce high correlation 
in the set. These repeated samples are reasonable from the point of view 
of statistics but contain little useful information. The lower the correlation 
degree is, the smaller the variance with a given sample size will be. Instead 
of sampling after each trial move, we could add a sample to the set after d 
cycles, for example. The new sample set will be referred to as coarse sample 
set, which is a subset of the full sample set. We can reduce the correlation 
degree of the coarse sample set by increasing d and its size is denoted by 
n. In MC simulations, only the coarse sample set is stored and the memory 
or disk usage requirements can be reduced significantly compared to storing 
the full sample set if d is much larger than one. The average value and the 
corresponding variance are calculated using the coarse sample set. 

In the above simulation of a LJ fiuid with T* = 0.9, we observed that the 
statistical noise of the simulation results in the dense-phase box is larger than 
in the other box. Fig. 10 shows that the variance results estimated by the 
blocking method. The variances of the normalized density, pressure of box 1 
with dense phase are larger than those of box 2 (the final wild fiuctuation is 
due to numerical instabilities when v! becomes too small) but their volume 
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variances are the same (the total volume V is fixed), which is consistent with 
the data shown in Figs. |2]j4j 

The CPU time is proportional to the total cycle times Lfotai which is 
almost equal to n x d [Ltotai = -^imt + n x d but the cycle times Li^u before 
convergence is negligible). We discuss the infiuence of n and d on the variance 
in what follows. The rules we obtain are expected to be independent on the 
particular MC simulation used to generate the correlated sample set. For 
simplicity, an artificial ideal system, which is much simpler than the LJ 
system, is used in following simulations. 

In the ideal system, the particle number is equal to 12 and particle 
coordinate only takes integral numbers s = ±1 as in the Ising model and the 
probability distribution function is: 

HiV oc exp[-t/.W)]exp[-t/.(4-''-)] 

" ' ' 2 ' N,\(N-N-,)\ ^ ' 

where U = —JYli>j^i^j is a summation spanning over all pairs (particle's 
periodic images are neglected here) located inside the same box and J = 0.1 
so that the acceptance ratio is not too small. For this model, we only need 
the spin trial move and the trial move of particle swap. The properties of the 
two boxes are equivalent, thus the correlation degree of their sample sets is 
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the same. The MC simulation results show that blocking processes of the U 

sample sets from the two boxes are very similar as shown in Fig. ITT] During 

— 

the blocking process, only the evolution of at the initial stage provides 
useful information relative to the approaching process to the variance. As 
n' shrinks, the evolution of the blocking process becomes unstable, leading 
to wild fluctuations. These fluctuations can be arbitrarily large increasing 
or decreasing the computed value. Those fluctuations are due to numerical 
instabilities and only serve to bound the trust worthy region of the blocking 
computation. These instability do not cause any problem if converges 
before losing stability (see Fig. 11 (left)). Thus, the value at the convergence 
point can be used to estimate the corresponding variance. But, in some cases 
where (^^'-i) cannot converge before losing stability (see Fig. 11 (right)), the 
problem is that it is difficult to judge where the separation pomt of the two 
stages is located, thus the lower bound of the variance, the largest value before 
losing stability, is unknown. When using two sample sets with the same 
correlation degree, their initial stage should be the same and the final stage 
with drastic fiuctuation is random, and so it is easy to find the separation 



point of the two stages. As shown in Fig. 11 (left) using n = 2^^ samples, 
the two curves overlap with each other and deviate after blocking 14 times 
which is the separation point. As it converges before the separation point, the 



variance for this two sample sets is about 3.3 x 10~ . In Fig. 11 (right) using 
only 2^° samples, the two curves overlap with each other before blocking 
10 times which is the separation point but are still not converged at the 
separation point and so, the lower bound of the variance is about 1.84 x 10~^, 
which is the largest value before losing stability. Using different sample sets 
with similar correlation degrees simplifies the computation of the variance 
lower bound, nevertheless in real MC simulations this would incur prohibitive 
computational demands, both memory and CPU time. Thus, as shown in 
Fig. 11, we propose to use the first maximal point in the blocking process 
as the separation point and use it to estimate the variance lower bond. This 
observation is justified by the fact that . ,° is a non-decreasing quantity 
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loss of stability of the blocking computation. 

Table [T] displays the variance for different combinations of n and d. The 
rows of table [T] correspond to a fixed sampling interval, d, which implies that 
the correlation degree for the coarse sample set is also fixed. As it can be 
observed on the table [T| for fixed d the variance is inversely proportional to 
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Table 1: Variance of Markov Chains Monte Carlo simulation results with 
different sample size n and sampling interval d 
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*Note: "Not yet converged" refers to simulations where the blocking process be- 
comes unstable for coarse samplings before achieving a definite maximum, as shown 



in Fig. 11 (right). 



the sample size n. This feature is well known for independent sample sets 
but deserves further theoretical analysis for a general sample set. The CPU 
time is almost proportional to n x as mentioned before. Thus, the variance 
is also almost inversely proportional to the CPU time and so it is the most 
rewarding choice to increase n in view of CPU time. Using V{n, d) as the 
variance at {n,d), we observed on the table [l] that: 

V{n2, d) ^ ni 

V{n^,d) n2 ^ ' 

Although increasing n is an efficient choice for reducing the variance in view 
of CPU time, it has onerous cost for memory or disk usage. For d = 1, 
the variance becomes very small only if n is huge which makes the memory 
requirement unacceptable. In order to reduce the variance while keeping the 
memory or disk usage low, we decrease the correlation degree of the coarse 
sample set by increasing d. For n = 2^^, the variance decreases to about a 
quarter of the previous value when d increases from 1 to 4, which is almost 
as efficient as increasing n in view of CPU time which is also increased four 
times. But, when d keeps on increasing from 16 to 64 with the CPU time 
being increased four times again, variance is reduced to 0.47 (instead of the 
ideal value 0.25) times of the previous value from 9.1 x 10^^ to 4.3 x lO"*^. 
This is not the most rewarding choice in view of CPU time because we can 
choose to increase n from 2^"^ to 2^^ with d fixed at 16 and CPU time also 
increased four times but then the variance decreases to about 0.25 times of 
the previous value from 9.1 x 10"^ to 2.3 x 10"^ as already pointed out above. 



18 



The following theoretical analysis can further prove that: 

^<^<l (10) 

da V{n,di) 

where d2 > di and equality holds when the samples of the coarse sample 
set of 0?! are already independent and so the correlation degree of the coarse 
sample set cannot be further reduced by increasing d. 

Usually, we also want to know how to reduce the variance for a given 
CPU time, namely n x d. In the case of small d, the magnitude of variance 
is inversely proportional to the CPU time. The larger the CPU time is, the 
smaller the variance will be. Given the same CPU time, the larger the sample 
size n is (the memory or disk usage for recording those samples is also larger) , 
the smaller the variance will be. But in the case of large d, the magnitude of 
the variance depends more on the n and in the limit case it depends only on 
n and independent on d. In fact, these laws are nothing new compared with 
Eq. (|9])-(10), according to which we have: 



V{ni,di) n2V{n2,di) 

and then 



nidi ^ Vin2,d2) ^ ni 
712^2 V{ni,di) ~ n2 



For ni X di = 722 X (^2 corresponding to the same CPU time, Eq. (12) can be 
replaced by a special form using a new variable V'{CPUtime, d): 

V'{CPUtime,d2) ^ ni 
^ V'{CPUtime,di) - n2 ^ ' 



where d2 > di as required in Eq. (10). 



5. Theoretical Analysis 

In section |4| we discussed the relationship between the variance and the 
sample size n and sampling interval d have some independent laws, namely 
Eq. ([9])-(10). These laws are independent of the blocking method used to 



calculate the variance. These laws reflect the underlying feature of the sta- 
tistical laws. In this section a theoretical analysis is presented to validate 
these observed laws. 
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r =0 



Figure 12: of sample sets with different correlation degree. 

Let Xi,X2, - ■ ■ , x„ be the results of a consecutive measurement of a random 
variable a; in a Monte Carlo simulation in thermal equilibrium state, which 



means: 



(xi) = {xi+t) , yt 

(XiXj) - (Xi) (Xj) = (XiXm) - (xi) {Xm) , \i - j\ = \l - m\ (14) 



where (■ ■ ■ ) denotes the expected value with respect to these exact but un- 
known probability distributions. In MC simulations, we estimate the ex- 
pected value {x) by the average quantity x = ^ Yll=i ^« ^^"^ variance of 
X is: 



aHx) 



1 

^2 



7: 



n 



n-l 



7o 



t=i 



t 



n 



It 



(15) 



where = (xiXj) — (xi) {xj) and 7t = 7ij,t = \i — j\. In Monte Carlo 
simulation, it is reasonable to assume: 



n 

-y 



a^ix) 



1 



o'^ix) > 



(16) 



where the equality holds when the samples are independent with each other. 
Fig. 12 shows some representative results of jij in usual MC simulations, 
(left) shows the results of a high-correlation sample set compared 
12 (middle). In the limit case where each sample is an independent 



12 



Fig. 
to Fig. 

random variable, jij is equal to a constant for i = j and zero otherwise as 
shown in Fig. 12 (right). 



20 



Theorem 5.1. Assume that the sample size n is much larger than maximal 
correlation interval r, then the variance a'^ix) with fixed sampling interval d 
is inverse proportional to n. 

Proof. For a given d, the correlation degree is fixed and we always can make 
an estimation for r where 7^ 0, t > t (see Fig. 12). Then, Eq. (15) is 
simphfied to 



1 

n 



t=i ^ ^ 



It 



n 



7o 



1- 



t=i 



t 



n 



It 



(17) 



Assuming the sample size n is large enough such that n ^ r, we conclude 
that 



cr [x] 



n 



+ 

t=i 

which is consistent with Eq. ([9]). 



It 



n 



7o 



It 



t=i 



□ 



Remark 5.1. The relationship between the variance and the sample size in 
Theorem 5.1 is well-known for independent sample set hut holds for correlated 
sample set only if the sample size t. In fact, n ^ t means that the length 
of contours with nontrivial value in Fig. 12_ is proportional to the sample size 
n and so 'Y^^j=ili,j ^■^ o^^^o proportional to n making a'^ix) = ■^'Y^^ j=ili,3 
inverse proportional to n. 

Theorem 5.2. Given two sample sets with the same sample size n hut differ- 
ent sampling intervals, di and d2 (^2 > di), respectively, then their variances 
satisfy 

di ^ (n, d2) ^ ^ 
(^2 0"^ (n, di) ~ 

Proof. Here, we first discuss two sample sets: Ya = {yi \yi = Xi,i = 1, 2, ■ ■ ■ ,nd} 
containing n x d samples and = {yi \yi = i = 1,2, ■ ■ ■ ,n} con- 

taining n samples generated once from each d samples of Ya. According to 
Eq. (15), we have: 



'iVa) 



{ndy 



End 
i,j=l 7i 



7ij 
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Figure 13: Schematic model for the summations with d = 4. 



As shown in Fig. 13, ^"^=1 is summation over all vertexes (without re- 
peating) of small quadrilaterals but ^ij=(fc_i)d+i is summation over only the 

fc — 1, ■ - ■ ,n 

bottom-left vertexes of larger quadrilaterals which are marked by red and 
blue color. 

In the area Quue of each blue quadrilateral centered at the maximum 
value of 7t (see Fig. fl3]), it can be observed that 



d 



< 



7i 



(20) 



i,j = {fc-l)d+l 



i,j = {k-l)d+l 
'''£^6! tie 



where the equality holds when 7^ = 0, t > 0. This can be understood by 



considering one of the blue quadrilaterals in Fig. 13 Then, realizing that the 
leftmost summation, ^t,j={fe_i)d+i 7j only contains the bottom left-hand 

corner of the quadrilateral, namely the maximum value which lies on the 
diagonal. Multiplying this maximum value by d will be lower or equal to 
y^ioco 7i7! having d maximum terms and other terms with smaller but 
still positive values. The second part of the inequality stems from the fact 
that d'^ is multiplying one maximum term, and this will always be greater 
than ^ . jgQj^j 7j,j, having d^ terms but only d terms taking maximum values. 
In the area ^red of those red quadrilaterals located always at the monotone 
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interval of 7t, we can use linear interpolation leading to the following esti 
mat ion: 



(21) 



According to Eqs. (20)-(21), we have 



i,j = (fc-l)d+l 



nd 



d Yl '^^'^ + Y '^^'j - Y < Y 



(22) 



i,j=(k-i)d+i 



i,j = {fc-l)d+l 



i,j = (k-l)d+l 
fc— 1,- ■ ■ ,n 



At this point we assume Eq. (16) valid, thus we get {d > 1 and d > d): 

nd 

d Y - Y ^^'3 < Y 



(23) 



ij=(k-i)d+i 

A; — 1 , ■ ■ ■ ,n 



i,j = {fc-l)d+l 
fc — 1 , ■ ■ ■ ,n 



where the equality holds also at the same condition of 7* = 0, t > 0. Substi 

ve 



tuting Eq. (23) into Eq. ( |19[ ), we have 



d 



(24) 



Now introducing = {yi\yi = Xi,i = 1,2, ■■■ ,n} containing n samples as 
Yb but having the same correlation degree as Ya and taking Eq. (18) into 
consideration, we have: 



nd 



d 



n 



d 



with which we finish the proof. 



< 



(25) 



< 1 



□ 



Remark 5.2. Taking the sample set with di in Eq. (10) as Y^ and the other 
as Yh, it is easy to see that Eq. (10) is equivalent to Eq. (25) proved here. Note 



that if the sample set Ya ( namely Y^) has a high correlation degree making the 

/2(jj converges 



summation over the area ^red dominant (see Fig. 12 (left)), ^5^^ 



to 2 according to Eq. (21). But in contrast, ^2(|^ = 1 if the sample in Ya are 
independent. 
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6. Conclusions 



The influence of the sample size and sampling interval used in MC sim- 
ulations on the variance of the average quantities is analyzed by numerical 
results and proved by theoretical analysis. In the case of large sample size, 
the variance with fixed sampling interval is inversely proportional to the 
sample size and the CPU time. For a given CPU time, the memory or disk 
usage (namely the sample size) can be reduced greatly by increasing the 
sampling interval and sometimes the corresponding increase in the variance 
is negligible. 

In the implementation of the blocking method, the blocking process is 
subject to increased fluctuations when the sample size n' is reduced; in par- 
ticular, the fluctuation gets its worst value when n' approaches to two. The 
current results show that the fluctuation starts near the flrst maximal point 
obtained during the blocking process. Additionally, the corresponding max- 
imal value can be used as an estimate of the variance if the blocking process 
converges or as a lower bound estimate of the variance if the blocking process 
does not converge. 
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